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1.  Introduction 

Consider  the  system  of  N  linear  equations 
(1)  A  x  •  b  , 

where  the  coefficient  matrix  A  is  symmetric  and  positive  definite.  When  A  is 
large  and  sparse,  the  preconditioned  conjugate  gradient  (PCG)  method  is  an 
effective  means  for  solving  CD  [2,  4,  5,  9,  13].  Given  an  initial  gues6  Xq, 
we  generate  a  sequence  {x^}  of  approximations  to  the  solution  x  as  follows: 


(2a)  p0  "  r0  •  b  ■  A*0* 


(2b)  Solve  Mr^  *  r^ 


FOR  k  -  0  STEP  1  UNTIL  Convergence  DO 
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Tha  effect  of  the  preconditioning  matrix  M  is  to  increase  the  rate  of 
convergence  of  the  basic  conjugate  gradient  method  of  Bcstenes  and 
Stiefel  (111.  The  number  of  multiply-adds  per  iteration  is  just  5N,  plus  the 
number  required  to  form  Ap^,  plus  the  number  required  to  solve  Hr'  ■  r^. 
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One  widely  used  class  of  preconditionings  are  the  incomplete  LDLC 
factorizations 

(3)  M  -  (D+L)  D'1 * 3  (D+L)1  , 

where  A  ■  L+D+L1,  L  is  strictly  lower  triangular,  and  D  and  5  are  positive 
diagonal.  Thi6  class  includes  the  SSOR  [9],  Dupont-Kendall-Rachford  [7], 
Generalized  SSOR  [1],  1CCG(0)  [13],  and  MICCG(O)  [10]  preconditionings. 
Letting  NZ(A)  denote  the  number  of  nonzero  entries  in  the  matrix  A,  a 

straight-forward  implementation  of  PCG  with  a  preconditioning  from  this 

1  2 
class  would  require  6N+2KZ(A)  multiply-adds  per  iteration. 

In  this  brief  note,  we  show  how  to  reduce  the  work  to  8N+KZ(A) 

multiply-adds,  asymptotically  half  as  many  as  the  straight-forward 
3 

implementation.  We  give  details  in  Section  2,  and  consider  6ome 
generalizations  in  Section  3. 

2.  Implementation 

The  linear  system  (1)  can  be  restated  in  the  form 


1  Writing  M  as  (D+LHl+D^L*),  we  solve  Mr'  ■  rfc  by  solving  the  triangular 
systems  (D+Dt^  »  r^,  (I+D  *Ll)r£  “  t^. 

1  2N  (respectively,  N)  multiply-adds  can  be  saved  by  syssoetrically  scaling 
the  problem  to  make  5  ■  I  (respectively,  D  ■  I). 

3  A  similar  speedup  for  pairs  of  linear  iterative  methods  is  given  in  [6]. 
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(4)  [(D+L)"1  A  (D+L)_t]  KD+L)*  x]  -  [(D+L)-1  b] 
or 

(5)  A  x  ■  b  . 

But  applying  PCG  to  (1)  with  M  *  (D+L)D  *(D+LC)  is  equivalent  to  applying  PCG 
to  (5)  with  M  ■  D  and  setting  x  ■  (D+L)  x.  If  we  update  x  instead  of  x 

at  each  iteration,  algorithm  (2)  becomes: 

(6a)  P0  -  S0  -  b  -  Axq 

(6b)  Compute  r^  *  Dr^ 

FOR  k  *  0  STEP  1  UNTIL  Convergence  DO 

(6c)  ak  -  (rk,r')  /  (pk,Apk) 

(6d)  Vi  "  *k  +  ;k(5+L)_tPk 

(6e)  Vl  *  *k  ’  V*k 

(6g)  Compute  r'+1  -  ®*k+1 

*  Both  are  equivalent  to  applying  the  basic  conjugate  gradient  method  to 
the  preconditioned  system 

Ax  ■  [D1/2(S+L)_1  A  (D+L)_tD1/2]  lff"1,2(D+L)t  x]  -  (D1/2(D+L)-1  b]  ■  b 


(see  I4J,  pp.  58-59). 
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,  A  A  , 


(6g)  bR  -  <rk+1*rk+1>  I  (rk*rk) 


(6h) 


pk+l  "  rk+l  +  bkpk 


Ap^  can  be  computed  efficiently  by  taking  advantage  of  the  following 
identity: 

( 7 )  Apk  -  (D+L)"1  [ (D+L)  (D+L)1  -  (2D-D)]  (D+L)"*  pfc 

-  (D+L)_tpk  ♦  (D+L)'1  [pk  -  K(D+L)_tpk]  , 

where  K  ■  2D-D.  Thus 


(8a)  tk  -  (D+L)-t  9k 

(8b)  Apk  -  tk  +  (D+L)"1  (?k  -  Ktk)  , 


which  requires  2N+NZ(A)  multiply-addB.  tk  can  also  be  uBed  to  update  ^ 
(6d),  so  that  the  total  cost  for  each  PCG  iteration  is  just  8N+NZ(A) 
multiply-adds , ^  versus  6N+2NZ(A)  for  the  straight-forward  implementation. 


in 


3.  Generalizations 

The  approach  presented  in  Section  2  extends  immediately  to 
preconditionings  of  the  form 


3  Again,  3N  multiply-adds  can  be  saved  by  symmetrically  scaling  the  problem 
so  that  D  ■  I. 
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(9)  M  -  (5+0  S'1  (D+L)1  , 

where  1  is  positive  diagonal.  Moreover,  if  we  take  K  ■  D+Dt-D  in  (7)  and 
(8),  then  D  need  not  be  diagonal  or  even  symmetric.  In  this  case,  5  would 
reflect  changes  to  both  the  diagonal  and  off-diagonal  entries  of  A  in 
generating  an  incomplete  factorization.  If  we  assume  that  only  the  nonzero 
entries  of  A  are  changed,  i. e. ,  that  (K)^  is  nonzero  only  if  (A)^  i6 
nonzero,  then  the  operation  count  is  7N+NZ(A)+NZ(K). 

Another  application  is  to  preconditioning  nonsymmetric  systems.  Let 

(10)  M  -  (D+L)  S'1  (D+U)  , 

be  an  incomplete  LDU  factorization  of  a  nonsymmetric  matrix  A,  where 
A  ■  L+D+U,  L  (respectively,  U)  is  strictly  lower  (respectively,  upper) 
triangular,  and  0  and  S  are  diagonal.  Then  a  number  of  authors  have  proposed 
solving  the  linear  system  Ax  »  b  by  solving  the  normal  equations  for  one  of 
the  preconditioned  systems 

(11a)  AjX  •  [S  (D+L)'1  A  (D+U)'1]  [(D+U)  x]  -  [S  (D+L)"1  b]  ■  b 
(see  [12])  and 

(lib)  A2x  •  [(D+U)"1  S  (D+L)"1  A]  x  -  [(D+U)"1  S  (D+L)"1  b]  ■  b 
(see  [14,  3]).  A2p  can  be  computed  as 
(12)  A2P  •  (D+U)"1  S  [p  ♦  (D+L)"1 (D+U-D)p] 
in  4N+NZ(L)+2NZ(U)  multiply-adds,  whereas  A^p  can  be  computed  as 
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(13a)  t  -  (D+U)-1p 

(13b)  AjP  -  S  It  ♦  (Dt-L)'1  (p  -  ( 2D-D )  t ) ) 

in  4N+NZ(L)+NZ(U)  multiply-adds.  Thus  the  first  approach  would  be  more 
efficient  per  iteration,  although  more  iterations  might  be  required  to 
achieve  comparable  accuracy  [14].^ 
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The  same  would  be  true  if  a  Generalized  Conjugate  Residual  method  such  as 


Orthomin  [15,  8]  were  used  to  solve  (11a)  or  (lib). 
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